Note 6 Functional Dependency Theory
Author: Zhen Tong 120090694@link.cuhk.edu.cn
Lecturer: Jan YOU
To better understand this chapter, you are encourage to play the functional dependency calculator with lab2 on my githubヾ(≧▽≦*)o
Closure of a FDs set
In the last note we learned the concept of closure of a set of functional dependencies. Now let’s see how to derive a
We can compute by repeatedly applying Armstrong’s Axioms.
- Reflexive rule: if , then
- Augmentation rule: if , then
- Transitivity rule: if
According to the Armstrong’s Axioms there are 3 additional rules inferred.
- Union rule: If holds and holds, then
- Decomposition rule: If , then holds and holds
- Pseudotransitivity rule:If holds and , then
The following computes the closure of a set of functional dependencies F
Closure of Attribute Sets
Given a set of attributes , define the closure of under F (denoted by) as the set of attributes that are functionally determined by under F
The pseudo-code is like:
def sub_closure(self, sub_attrs:set):
'''
Given a set of attributes alpha, define the closure of alpha+ under F
'''
result = set()
result_prime = set(a for a in sub_attrs)
while len(result) != len(result_prime):
# copy the prime to origin
result = set(a for a in result_prime)
# print("result = ", result)
for fd in self.FDs:
if fd.alpha.issubset( result) :
result_prime = result_prime.union(fd.beta)
# print("Unioned:", result_prime)
print("result", result)
return result
Example of Attribute Set Closure
We can find the Closure of Attribute Sets with the function:
There are several uses of the attribute closure algorithm:
- Testing for superkey:
- To test if is a superkey, we compute , and check if contains all attributes of R
- Testing functional dependencies
- To check if a functional dependency holds (or, in other words,
is in F+), just check if
- That is, we compute + by using attribute closure, and then check if it contains
- To check if a functional dependency holds (or, in other words,
- Or simply we just want to compute the sub-closure among the sub-attributes.
Canonical Cover
A canonical cover is a minimal and simplified set of functional dependencies that has the same closure as the original set of functional dependencies on a relation schema. It is used to make it more efficient to ensure that functional dependencies are not violated when updates are performed on the database.
In one word, canonical cover is summary the essence of a set of FD
- Ensuring FD Satisfaction: When updates (inserts, updates, deletes) are performed on the database, the database system needs to ensure that these updates do not violate any of the functional dependencies in the set F. If an update would cause a violation, it must be rolled back to maintain data consistency.
- Canonical Cover for Efficiency: To make the process of checking for violations more efficient, you can create a simplified set of functional dependencies known as the "canonical cover." The canonical cover contains only the essential functional dependencies required to represent the same closure (the set of all FDs derivable from F).
Extraneous Attributes: To construct the canonical cover, you must identify and eliminate extraneous attributes. An attribute in a functional dependency in F is extraneous if you can remove it without changing the closure of F. In other words, it doesn't contribute to the uniqueness of the attribute values.
Test if an Attribute is Extraneous
Let R be a relation schema and let F be a set of functional dependencies that hold on R . Consider an attribute in the functional dependency
To test if attribute is extraneous in
- Consider the set that removing from the FD
- Check that contains A under
- If does, A is extrainesou
To test if attribute is extraneous in
- Let . Check if can be inferred from F
- Compute using the dependencies in
- If inclurdes all attributes in , then A is extraneous in
In python the code is:
def is_extranious(self, fd:FunctionalDependency, A:set, determinant:bool):
alpha = copy.deepcopy(fd.alpha)
beta = copy.deepcopy(fd.beta)
if determinant:
gamma = alpha.difference(A)
gamma_plus = self.sub_closure(gamma)
if beta.issubset(gamma_plus):
return True
else:
return False
else:
F= set(i for i in self.FDs)
F_prime = (F.difference({fd})).union({FunctionalDependency(alpha, beta.difference(A))})
alpha_plus = self.sub_closure(alpha, F_prime)
if A.issubset(alpha_plus):
return True
else:
return False
Now we can compute the Canonical Cover
A canonical cover for F is a set of dependencies Fc such that:
- logically implies all dependencies in
- logically implies all dependencies in
- No functional dependency in contains an extraneous attribute
- Each left side of functional dependency in is unique.
- There are no two dependencies in :
and such that
- There are no two dependencies in :
To compute a canonical cover for F
The first thing during the loop is satisfying the fourth rule in Canonical Cover. And whenever there is a new functional dependency established based on that, the old two functional dependencies are useless.
The second thing during the loop is satisfying the third rule in Canonical Cover.
Dependency Preservation
If we decompose the relation schema into , the restriction of the functional dependency set to is the set of all functional dependencies in F+that include only attributes of R
- Note that the definition of restriction uses all dependencies in F+, not just those in F
A decomposition is dependency-preserving, if
BCNF Decomposition Algorithm
In the algorithm above, we need to first check whether the decomposed relational schema in the results is BCNF. We can do any one of the two things:
- Naively, you check the BCNF condition in the new closure (restriction). Given a decomposed relational schema , derive the restriction with respect to the . Then check the BCNF condition for the restriction set.
- Recap: the BCNF check is do the for loop for every functional dependency in the functional dependency set that:
- is trivial
- is a superkey for
- Recap: the BCNF check is do the for loop for every functional dependency in the functional dependency set that:
- Or you can do this: use the original set of dependencies F that hold on R, but with the
following test:- For every attribute combination , check that (the attribute closure of ) either includes no attribute of , or includes all attributes of
- If includes none of the attributes of that means is not a (non-trivial) determinant within Ri
- If includes all of the attributes of Ri, then is a superkey of R
Then we come back to the BCNF Decomposition Algorithm, after we detect one functional dependency is violating the BCNF in the , we do the decomposition of into and
Did you notice that the algorithm doesn’t check the Dependency Preservation, it doesn’t merge all the restriction of the to see if the holds.
BCNF or 3NF
There are some situations where
- BCNF is not dependency preserving, and
- efficient checking for FD violation on updates is important
We can use the Third Normal Form (3NF)
- Allows some redundancy
- But functional dependencies can be checked on individual relations without computing a join
- There is always a lossless-join, dependency-preserving decomposition into 3NF
3NF Decomposition Algorithm
Before we introduce the 3NF Decomposition Algorithm, let’s first recap the 3NF definition:
- A relation R is in 3NF if for all in , it is not a transitive functional dependency, which means at least one of the following holds:
- is trivial because a trivial dependency is not a transitive functional dependency.
- is a superkey for , because the determinant in transitive functional dependency is a non-primary attribute. If is a superkey, then, it cannot be partial FD, and cannot be a non-primary attribute.
- Each attribute A Each attribute A in is contained in a candidate key for R. For example, and are all candidate key, but exclusive. (NOTE: the third condition does not say that a single candidate key must contain all the attributes in ; each attribute A in may be contained in a different candidate key)
We know BCNF is a sufficient set of 3NF, i.e. if a relational schema is a BCNF, it is a 3NF. Now let’s have a look at the 3NF decomposition.